Learning spectro-temporal representations of complex sounds with parameterized neural networks
Authors
Abstract
Deep Learning models have become potential candidates for auditory neuroscience research, thanks to their recent successes on a variety of auditory tasks. Yet, these models often lack the interpretability needed to fully understand the exact computations that have been performed. Here, we proposed a parametrized neural network layer that computes specific spectro-temporal modulations based on Gabor kernels (Learnable STRFs) and that is fully interpretable. We evaluated the predictive capabilities of this layer on Speech Activity Detection, Speaker Verification, Urban Sound Classification, and Zebra Finch Call Type Classification. We found that models based on Learnable STRFs are on par for all tasks with different toplines, and obtain the best performance for Speech Activity Detection. As this layer is fully interpretable, we used quantitative measures to describe the distribution of the learned spectro-temporal modulations. The filters adapted to each task and focused mostly on low temporal and spectral modulations. The analyses show that the filters learned on human speech have similar spectro-temporal parameters as the ones measured directly in the human auditory cortex. Finally, we observed that the tasks organized in a meaningful way: the human vocalization tasks were closer to each other, and bird vocalizations were far away from human vocalizations and urban sound tasks.
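To make the idea of a Gabor-parameterized spectro-temporal filter concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `gabor_strf`, its parameter names, and the choice to return only the real part of the kernel are assumptions made for illustration. A learnable layer would treat the Gaussian widths and modulation frequencies as trainable parameters and convolve the kernel with a spectrogram.

```python
import numpy as np

def gabor_strf(n_time, n_freq, sigma_t, sigma_f, omega_t, omega_f):
    """Real part of a 2D Gabor kernel on a (frequency, time) grid.

    Hypothetical helper for illustration only.
    sigma_t, sigma_f: Gaussian envelope widths (in bins).
    omega_t, omega_f: temporal and spectral modulation (radians per bin).
    """
    # Axes centred on zero so the envelope peaks in the middle of the kernel.
    t = np.arange(n_time) - (n_time - 1) / 2.0
    f = np.arange(n_freq) - (n_freq - 1) / 2.0
    F, T = np.meshgrid(f, t, indexing="ij")  # both have shape (n_freq, n_time)

    # Gaussian envelope localises the filter in time and frequency.
    envelope = np.exp(-(T ** 2) / (2 * sigma_t ** 2) - (F ** 2) / (2 * sigma_f ** 2))
    # Sinusoidal carrier selects one spectro-temporal modulation.
    carrier = np.cos(omega_t * T + omega_f * F)
    return envelope * carrier

# Example: a 9x9 kernel tuned to a low temporal and spectral modulation.
kernel = gabor_strf(n_time=9, n_freq=9, sigma_t=2.0, sigma_f=2.0,
                    omega_t=0.5, omega_f=0.25)
```

Because every kernel is described by four numbers, the filters can be read off directly as temporal and spectral modulation values, which is the sense in which such a layer is interpretable.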
Similar resources
Learning nonnegative features of spectro-temporal sounds for classification
In this paper we present a method of sound classification which exploits a parts-based representation of spectrotemporal sounds, employing the nonnegative matrix factorization (NMF) [1]. We illustrate a new way of learning nonnegative features using a variant of NMF and show its useful behavior in the task of general sound classification with comparison to independent component analysis (ICA) w...
Learning Anonymized Representations with Adversarial Neural Networks
Statistical methods protecting sensitive information or the identity of the data owner have become critical to ensure privacy of individuals as well as of organizations. This paper investigates anonymization methods based on representation learning and deep neural networks, and motivated by novel information-theoretical bounds. We introduce a novel training objective for simultaneously training ...
Nonnegative features of spectro-temporal sounds for classification
A parts-based representation is a way of understanding object recognition in the brain. The nonnegative matrix factorization (NMF) is an algorithm which is able to learn a parts-based representation by allowing only non-subtractive combinations (Lee and Seung, 1999). In this paper we incorporate a parts-based representation of spectro-temporal sounds into the acoustic feature extraction, which ...
Learning to localise sounds with spiking neural networks
To localise the source of a sound, we use location-specific properties of the signals received at the two ears caused by the asymmetric filtering of the original sound by our head and pinnae, the head-related transfer functions (HRTFs). These HRTFs change throughout an organism’s lifetime, during development for example, and so the required neural circuitry cannot be entirely hardwired. Since H...
Journal
Journal title: Journal of the Acoustical Society of America
Year: 2021
ISSN: 0001-4966, 1520-9024, 1520-8524
DOI: https://doi.org/10.1121/10.0005482